Poster: | molly | Date: | Mar 27, 2005 1:15pm |
Forum: | web | Subject: | Re: Robots archive - noarchive META tags |
Create a text file called robots.txt in Notepad (or whatever text editor you use) and put this in it:
User-agent: ia_archiver
Disallow:

User-agent: *
Disallow: /
This means ia_archiver can get to everything (an empty Disallow line blocks nothing), while every other robot is disallowed from the top level of your directories down. If you prefer, you can name the specific robots you want to disallow instead of using the * wildcard. Put this file in the top level of your website, and you are good to go!
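If you want to check the rules before uploading the file, here is a minimal sketch using Python's standard urllib.robotparser module. The example.com URL and the "SomeOtherBot" name are placeholders made up for illustration, and Python's parser is only one interpretation of robots.txt, so a given crawler may behave differently.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the post, as a string.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow:

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Placeholder URL; any page on the site is treated the same way by these rules.
page = "http://example.com/some/page.html"

# ia_archiver matches its own record, whose empty Disallow blocks nothing.
archiver_allowed = parser.can_fetch("ia_archiver", page)

# "SomeOtherBot" is a hypothetical robot; it falls through to the * record.
other_allowed = parser.can_fetch("SomeOtherBot", page)

print("ia_archiver allowed:", archiver_allowed)
print("SomeOtherBot allowed:", other_allowed)
```

With these rules the first check should come back allowed and the second disallowed.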
Here are some examples of other sites' robots.txt files:
archive.org/robots.txt
nytimes.com/robots.txt
cnn.com/robots.txt
bbc.co.uk/robots.txt
craigslist.org/robots.txt
slashdot.org/robots.txt
Poster: | NoArchive | Date: | Mar 28, 2005 10:43am |
Forum: | web | Subject: | Re: Robots archive - noarchive META tags |
Poster: | Igor Ranitovic | Date: | Mar 29, 2005 1:12am |
Forum: | web | Subject: | Re: Robots archive - noarchive META tags |
- Courtesy never hurts -- it might help
- Make your post easy to reply to
- Be explicit about the question you have
- Follow up with the solution
Poster: | molly | Date: | Mar 29, 2005 7:00am |
Forum: | web | Subject: | Re: Robots archive - noarchive META tags |
To block specific user agents while allowing access to others, robots.txt is a good choice. It also lets you control robot access from one central file instead of managing the noarchive META tag in a large number of documents.
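As a rough sketch of that central-file approach, the Python below generates a robots.txt that blocks a list of named robots and leaves everyone else alone, then checks the result with the standard urllib.robotparser module. The build_robots_txt helper and the "ExampleBot" and "AnotherBot" names are hypothetical, chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(blocked_agents):
    """Return robots.txt text that blocks the named agents and allows all others."""
    # One record per blocked robot: "Disallow: /" shuts it out of the whole site.
    records = [f"User-agent: {agent}\nDisallow: /" for agent in blocked_agents]
    # Catch-all record: an empty Disallow means nothing is off limits.
    records.append("User-agent: *\nDisallow:")
    # Records are separated by a blank line.
    return "\n\n".join(records) + "\n"


# Hypothetical robot names, for illustration only.
robots_txt = build_robots_txt(["ExampleBot", "AnotherBot"])
print(robots_txt)

# Sanity-check the generated rules against a placeholder URL.
checker = RobotFileParser()
checker.parse(robots_txt.splitlines())
page = "http://example.com/page.html"
blocked_result = checker.can_fetch("ExampleBot", page)
other_result = checker.can_fetch("ia_archiver", page)
```

Keeping the list of blocked robots in one place means a change is a one-line edit rather than a pass over every page.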
Thanks for using our services.
Poster: | Bob_Dratch | Date: | May 13, 2005 1:58am |
Forum: | web | Subject: | Re: Robots archive - noarchive META tags |
I strongly suggest a CLASS action suit be brought against these archivers who take material without the copyright owner's permission. I've seen MANY commercial pages republished by this "service" without the owner's permission. It's not up to the copyright owner to tell these people to remove stolen material. The GLOBAL copyright protection laws protect the copyright owner. Groups like this are not honoring copyright law, and really need to be addressed legally.
There is nothing noble about a thief.
This post was modified by Bob_Dratch on 2005-05-13 08:58:15
Poster: | ellenlangsetmo | Date: | Jul 31, 2007 10:37am |
Forum: | web | Subject: | Re: Robots archive - noarchive META tags |
Poster: | molly | Date: | May 13, 2005 2:16am |
Forum: | web | Subject: | Re: Robots archive - noarchive META tags |
We'd be happy to remove any of your pages from the Wayback Machine. Please email us at info@archive.org.
-best
Molly
Poster: | smartsight | Date: | Jun 16, 2006 4:56am |
Forum: | web | Subject: | Re: Robots archive - noarchive META tags |
(Note that robots.txt does not let one make any distinction between caching and indexing, so that does not help either.)
Thanks.
This post was modified by smartsight on 2006-06-16 11:55:18
This post was modified by smartsight on 2006-06-16 11:56:14
Poster: | simon c | Date: | Mar 28, 2005 12:43pm |
Forum: | web | Subject: | Re: Robots archive - noarchive META tags |